Multi-modal Interactive Video Retrieval with Temporal Queries

Authors

Abstract

This paper presents the version of vitrivr participating at the Video Browser Showdown (VBS) 2022. vitrivr already supports a wide range of query modalities, such as color and semantic sketches, OCR, ASR, and text embedding. In this paper, we briefly introduce the system, then describe our new approach to queries specifying temporal context, ideas for color-based sketches in a competitive retrieval setting, and novel pose-based queries.

Similar resources

Multi-modal query expansion for video object instances retrieval

In this paper we tackle the issue of object instance retrieval in video repositories using minimal information from the user (e.g., textual descriptions or tags). Starting from a set of tags, images containing the object of interest are crawled from popular image search engines and repositories (e.g., Bing, Flickr, Google) and the positive and most representative instances of the object are automati...

Multi-modal Classifier Fusion for Video Shot Content Retrieval

In this paper we present a new chromosome to solve the problem of classifier fusion using a genetic algorithm. Experiments are conducted in the context of TRECVID. In particular, we focus on the feature extraction task, which consists of retrieving video shots expressing one of the predefined semantic concepts. Three modalities (visual, textual and motion) and two features per modality are used to descr...

Interactive Multi-Modal Robot Programming

As robots enter the human environment and come in contact with inexperienced users, they need to be able to interact with users in a multi-modal fashion—keyboard and mouse are no longer acceptable as the only input modalities. This paper introduces a novel approach to program a robot interactively through a multi-modal interface. The key characteristic of this approach is that the user can prov...

Multi-modal Medical Image Retrieval

Images are ubiquitous in biomedicine and the image viewers play a central role in many aspects of modern health care. Tremendous amounts of medical image data are captured and recorded in digital format during the daily clinical practice, medical research, and education (in 2009, over 117,000 images per day in the Geneva radiology department alone). Facing such an unprecedented volume of image ...

Multi-Modal Fashion Product Retrieval

Finding a product in the fashion world can be a daunting task. Every day, e-commerce sites are updated with thousands of images and their associated metadata (textual information), deepening the problem. In this paper, we leverage both the images and textual metadata and propose a joint multi-modal embedding that maps both the text and images into a common latent space. Distances in the latent ...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2022

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-98355-0_44